Two Causally Related Needles in a Video Haystack

Li, Miaoyu, Chao, Qin, Li, Boyang

arXiv.org Artificial Intelligence

Properly evaluating the ability of Video-Language Models (VLMs) to understand long videos remains a challenge. We propose a long-context video understanding benchmark, Causal2Needles, that assesses two crucial abilities insufficiently addressed by existing benchmarks: (1) extracting information from two separate locations (two needles) in a long video and understanding them jointly, and (2) modeling the world in terms of cause and effect in human behaviors. Causal2Needles evaluates these abilities using noncausal one-needle, causal one-needle, and causal two-needle questions. The most complex question type, causal two-needle questions, requires extracting information about both the cause and effect events from a long video and the associated narration text. To prevent textual bias, we introduce two complementary question formats: locating the video clip containing the answer, and verbally describing a visual detail from that clip. Our experiments reveal that models excelling on existing benchmarks struggle with causal two-needle questions, and that model performance is negatively correlated with the distance between the two needles. These findings highlight critical limitations in current VLMs. The dataset is available at: https://huggingface.co/datasets/causal2needles/Causal2Needles
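The reported negative correlation between needle distance and model performance can be checked with a simple per-question analysis. A minimal sketch (the function name and inputs are hypothetical illustrations, not part of the benchmark's released tooling):

```python
import numpy as np

def distance_accuracy_correlation(distances, correct):
    """Pearson correlation between needle separation and correctness.

    distances: separation (e.g., in seconds) between the cause and the
               effect clip for each two-needle question.
    correct:   1 if the model answered that question correctly, else 0.
    A negative return value indicates performance degrades as the two
    needles move farther apart.
    """
    d = np.asarray(distances, dtype=float)
    c = np.asarray(correct, dtype=float)
    # Standardize both series, then average the elementwise products.
    d = (d - d.mean()) / d.std()
    c = (c - c.mean()) / c.std()
    return float(np.mean(d * c))
```

A strongly negative value over many questions would reproduce the trend the abstract describes.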




What's Behind the Magic? Audiences Seek Artistic Value in Generative AI's Contributions to a Live Dance Performance

Bruen, Jacqueline Elise, Jeon, Myounghoon

arXiv.org Artificial Intelligence

With the development of generative artificial intelligence (GenAI) tools to create art, stakeholders have not reached agreement on the value of these works. In this study we uncovered the mixed opinions surrounding art made by AI. We developed two versions of a dance performance augmented by technology, either with or without GenAI. For each version we informed audiences of the performance's development either before or after a survey on their perceptions of the performance. Thirty-nine participants (13 male, 26 female) were divided among the four performances. Results demonstrated that individuals were more inclined to attribute artistic merit to works made by GenAI when they were unaware of its use. We present this case study as a call to address the importance of utilizing the social context and the users' interpretations of GenAI in shaping a technical explanation, leading to a greater discussion that can bridge gaps in understanding.


Part$^{2}$GS: Part-aware Modeling of Articulated Objects using 3D Gaussian Splatting

Yu, Tianjiao, Shah, Vedant, Wahed, Muntasir, Shen, Ying, Nguyen, Kiet A., Lourentzou, Ismini

arXiv.org Artificial Intelligence

Articulated objects are common in the real world, yet modeling their structure and motion remains a challenging task for 3D reconstruction methods. In this work, we introduce Part$^{2}$GS, a novel framework for modeling articulated digital twins of multi-part objects with high-fidelity geometry and physically consistent articulation. Part$^{2}$GS leverages a part-aware 3D Gaussian representation that encodes articulated components with learnable attributes, enabling structured, disentangled transformations that preserve high-fidelity geometry. To ensure physically consistent motion, we propose a motion-aware canonical representation guided by physics-based constraints, including contact enforcement, velocity consistency, and vector-field alignment. Furthermore, we introduce a field of repel points to prevent part collisions and maintain stable articulation paths, significantly improving motion coherence over baselines. Extensive evaluations on both synthetic and real-world datasets show that Part$^{2}$GS consistently outperforms state-of-the-art methods by up to 10$\times$ in Chamfer Distance for movable parts.
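Chamfer Distance, the metric used for the movable-part comparison above, has a standard symmetric formulation that is easy to compute; a minimal NumPy sketch (the paper's exact evaluation protocol, e.g., normalization or sampling density, may differ):

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between point sets a (N,3) and b (M,3).

    For each point in a, take the squared distance to its nearest
    neighbor in b, and vice versa; return the sum of the two means.
    Lower is better; identical point sets give 0.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise squared distances via broadcasting, shape (N, M).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())
```

This brute-force version is O(N·M) in memory; for large point clouds a KD-tree nearest-neighbor query is the usual substitute.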


Accelerated Extragradient-Type Methods -- Part 2: Generalization and Sublinear Convergence Rates under Co-Hypomonotonicity

Tran-Dinh, Quoc, Nguyen-Trung, Nghia

arXiv.org Machine Learning

Following the first part of our project, this paper comprehensively studies two types of extragradient-based methods: anchored extragradient and Nesterov's accelerated extragradient for solving [non]linear inclusions (and, in particular, equations), primarily under the Lipschitz continuity and the co-hypomonotonicity assumptions. We unify and generalize a class of anchored extragradient methods for monotone inclusions to a wider range of schemes encompassing existing algorithms as special cases. We establish $\mathcal{O}(1/k)$ last-iterate convergence rates on the residual norm of the underlying mapping for this general framework and then specialize it to obtain convergence guarantees for specific instances, where $k$ denotes the iteration counter. We extend our approach to a class of anchored Tseng's forward-backward-forward splitting methods to obtain a broader class of algorithms for solving co-hypomonotone inclusions. Again, we analyze $\mathcal{O}(1/k)$ last-iterate convergence rates for this general scheme and specialize it to obtain convergence results for existing and new variants. We generalize and unify Nesterov's accelerated extragradient method to a new class of algorithms that covers existing schemes as special instances while generating new variants. For these schemes, we can prove $\mathcal{O}(1/k)$ last-iterate convergence rates for the residual norm under co-hypomonotonicity, covering a class of nonmonotone problems. We propose another novel class of Nesterov's accelerated extragradient methods to solve inclusions. Interestingly, these algorithms achieve both $\mathcal{O}(1/k)$ and $o(1/k)$ last-iterate convergence rates, and also the convergence of iterate sequences under co-hypomonotonicity and Lipschitz continuity. Finally, we provide a set of numerical experiments encompassing different scenarios to validate our algorithms and theoretical guarantees.
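For readers unfamiliar with the template, one well-known instance of anchored extragradient for finding a root of a Lipschitz mapping $F$ combines a Halpern-style anchor toward the initial point $x_0$ with an extragradient correction (a sketch for the single-valued equation case; the paper's unified framework generalizes this and related schemes):

```latex
% Anchored extragradient with anchor coefficient \beta_k = 1/(k+2)
% and step size \eta > 0, applied to solving F(x^\star) = 0:
\begin{aligned}
\tilde{x}_k &= x_k + \beta_k (x_0 - x_k) - \eta\, F(x_k), \\
x_{k+1}     &= x_k + \beta_k (x_0 - x_k) - \eta\, F(\tilde{x}_k).
\end{aligned}
```

Under monotonicity and Lipschitz continuity, schemes of this form attain the $\mathcal{O}(1/k)$ last-iterate rate on the residual norm $\|F(x_k)\|$ referenced in the abstract.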


Using Large Language Models for Automated Grading of Student Writing about Science

Impey, Chris, Wenger, Matthew, Garuda, Nikhil, Golchin, Shahriar, Stamer, Sarah

arXiv.org Artificial Intelligence

Assessing writing in large classes for formal or informal learners presents a significant challenge. Consequently, most large classes, particularly in science, rely on objective assessment tools such as multiple-choice quizzes, which have a single correct answer. The rapid development of AI has introduced the possibility of using large language models (LLMs) to evaluate student writing. An experiment was conducted using GPT-4 to determine whether machine learning methods based on LLMs can match or exceed the reliability of instructor grading in evaluating short writing assignments on topics in astronomy. The audience consisted of adult learners in three massive open online courses (MOOCs) offered through Coursera. One course was on astronomy, the second was on astrobiology, and the third was on the history and philosophy of astronomy. The results should also be applicable to non-science majors in university settings, where the content and modes of evaluation are similar. The data comprised answers from 120 students to 12 questions across the three courses. GPT-4 was provided with total grades, model answers, and rubrics from an instructor for all three courses. In addition to evaluating how reliably the LLM reproduced instructor grades, the LLM was also tasked with generating its own rubrics. Overall, the LLM was more reliable than peer grading, both in aggregate and by individual student, and approximately matched instructor grades for all three online courses. The implication is that LLMs may soon be used for automated, reliable, and scalable grading of student science writing.
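The setup above (an LLM given a rubric, model answer, and point total, then compared against instructor grades) can be sketched as follows. All names here are hypothetical illustrations, not the study's actual code, and the LLM call itself is omitted:

```python
def build_grading_prompt(question, rubric, model_answer, student_answer):
    """Assemble a grading prompt for an LLM (hypothetical helper; the
    actual prompts used in the study are not reproduced here)."""
    return (
        f"Question: {question}\n"
        f"Rubric: {rubric}\n"
        f"Model answer: {model_answer}\n"
        f"Student answer: {student_answer}\n"
        "Return a single numeric grade."
    )

def mean_absolute_disagreement(llm_grades, reference_grades):
    """Average absolute difference between two graders' scores for the
    same set of answers; 0 means perfect agreement."""
    pairs = list(zip(llm_grades, reference_grades))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)
```

Comparing this disagreement score for LLM-vs-instructor against peer-vs-instructor grades is one simple way to operationalize the reliability comparison the abstract reports.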


Concentration of Cumulative Reward in Markov Decision Processes

Sayedana, Borna, Caines, Peter E., Mahajan, Aditya

arXiv.org Machine Learning

In this paper, we investigate the concentration properties of cumulative rewards in Markov Decision Processes (MDPs), focusing on both asymptotic and non-asymptotic settings. We introduce a unified approach to characterize reward concentration in MDPs, covering both infinite-horizon settings (i.e., average and discounted reward frameworks) and the finite-horizon setting. Our asymptotic results include the law of large numbers, the central limit theorem, and the law of the iterated logarithm, while our non-asymptotic bounds include Azuma-Hoeffding-type inequalities and a non-asymptotic version of the law of the iterated logarithm. Additionally, we explore two key implications of our results. First, we analyze the sample path behavior of the difference in rewards between any two stationary policies. Second, we show that two alternative definitions of regret for learning policies proposed in the literature are rate-equivalent. Our proof techniques rely on a novel martingale decomposition of cumulative rewards, properties of the solution to the policy evaluation fixed-point equation, and both asymptotic and non-asymptotic concentration results for martingale difference sequences.
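For context, the classical Azuma-Hoeffding inequality that the paper's non-asymptotic bounds are of the type of reads as follows (this is the textbook statement, not the paper's generalized version):

```latex
% Azuma--Hoeffding: for a martingale difference sequence (D_t)_{t=1}^{n}
% with |D_t| \le c_t almost surely, and any \varepsilon > 0,
\Pr\left( \Big| \sum_{t=1}^{n} D_t \Big| \ge \varepsilon \right)
\le 2 \exp\left( - \frac{\varepsilon^2}{2 \sum_{t=1}^{n} c_t^2} \right).
```

Applied to the martingale decomposition of cumulative rewards mentioned above, bounds of this shape yield deviation guarantees for the reward sum around its mean.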


Tweet round up from #ECAI2024: part 2

AIHub

The 27th European Conference on Artificial Intelligence (ECAI-2024) took place from 19-24 October. Held in Santiago de Compostela, Spain, the event featured a full programme of technical papers, keynote and invited talks, workshops and tutorials, and panels. We took a look at what participants got up to over the second half of the event. "AI Regulation: The European Scenario" shed light on policy shifts, and "The Economic Impact of AI" discussed the challenges and opportunities ahead. One attendee tweeted: "Huge thanks to everyone who came to my #ecai2024 presentation and made it such a rewarding experience with your insightful questions and discussions!"